# open source

DeepSeek R1-0528

DeepSeek R1-0528 is the latest version released by DeepSeek, a well-known developer of open-source large models, and offers strong natural language processing and programming capabilities. The release has drawn wide attention for its performance on programming tasks and its ability to answer complex questions accurately. The model supports a range of application scenarios and is a useful tool for developers and AI researchers. More detailed model information and user guides are expected to follow.

Unmute

Unmute is a speech recognition and synthesis tool that lets users interact with AI through natural spoken language. Its low-latency design keeps the interaction smooth, which suits scenarios that need real-time feedback. The product is to be released as open source to encourage participation from more developers and users. Pricing has not been announced; a combined free and paid model is expected.

DMind

DMind-1 and DMind-1-mini are domain-specific large language models for Web3 tasks, offering higher domain accuracy, better instruction following, and deeper professional understanding than general-purpose models. DMind-1 is fine-tuned on expert-curated Web3 data and aligned through reinforcement learning from human feedback, which makes it suitable for complex instructions and multi-turn dialogue in areas such as blockchain, DeFi, and smart contracts. DMind-1-mini is a lighter version aimed at real-time, resource-constrained scenarios, particularly agent deployment and on-chain tools. Pricing and further details have yet to be confirmed.

artificial intelligence

OpenMemory MCP

OpenMemory is an open-source personal memory layer that provides private, portable memory management for large language models (LLMs). It gives users full control over their data and keeps that data secure when they build AI applications. The project supports Docker, Python, and Node.js, which makes it a good fit for developers building personalized AI experiences. It is particularly suited to users who want to use AI without revealing personal information.

AgentCPM-GUI

AgentCPM-GUI is an open-source large language model (LLM) agent for mobile devices. It operates Chinese and English applications and executes tasks automatically based on screenshots of the user's screen. Its main advantages are efficient understanding of GUI elements, enhanced reasoning ability, and precise support for Chinese applications. The project was developed to improve the experience of using intelligent agents on mobile devices, especially for complex tasks. It is positioned as a productivity tool for mobile devices and is aimed at a broad range of users.

intelligent agent

DeepSeek-R1-Distill-Llama-8B

DeepSeek-R1-Distill-Llama-8B is a language model from the DeepSeek team, built on the Llama architecture and distilled from DeepSeek-R1. The model performs well on reasoning, code generation, and multilingual tasks. The DeepSeek-R1 work it derives from was the first open research to show that reasoning capabilities can be developed through pure reinforcement learning. The model supports commercial use and allows modifications and derivative works, which makes it suitable for academic research and enterprise applications.

audiblez

Audiblez is a tool that uses Kokoro's high-quality speech synthesis to convert standard eBooks in .epub format into .m4b audiobooks. It supports multiple languages and voices, and the conversion is run from the command line. This is useful in situations where reading is not convenient, such as while driving or exercising. The tool was developed by Claudio Santini in 2025 and is open source under the MIT License.

llama-ocr

llama-ocr is an open-source npm library that offers free OCR using Llama 3.2 Vision. It supports both local and remote images, and PDF support is planned. Inspired by Zerox, it provides both free and paid interfaces.

Development and Tools

JavaVision

JavaVision is a comprehensive visual recognition project written in Java. It implements core functions such as PaddleOCR-V4 text recognition, YOLOv8 object recognition, face recognition, and search by image, and it can be extended to other fields such as voice recognition, animal recognition, and security checks. The project is built on the Spring Boot framework and emphasizes versatility, performance, reliability, easy integration, and flexibility. JavaVision aims to give Java developers a complete visual recognition solution so they can build reliable, easily integrated AI applications in a language they already know.

AI image detection and recognition

Featured AI Tools

Jules AI

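Jules is an asynchronous coding agent that automatically handles tedious coding tasks so that you can spend your time on core coding work. Its main strength is its GitHub integration: it automates pull requests (PRs), runs tests, and validates code on cloud virtual machines, which substantially improves development efficiency. Jules suits a wide range of developers and is especially helpful for busy teams that need to manage project and code quality effectively.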

Development and programming

NoCode

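NoCode is a platform that requires no programming experience. Users describe their ideas in natural language and quickly generate applications. This lowers the barrier to development and lets more people bring their own ideas to life. The platform offers real-time preview and one-click deployment, and it is designed to be easy to use even for users with no technical background.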

Development platform

ListenHub

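ListenHub is a lightweight AI podcast generator that supports Chinese and English. It uses AI to quickly generate podcast content on topics the user is interested in. Its main advantages are natural-sounding conversation and very high audio quality, so users can enjoy a high-quality listening experience anytime and anywhere. Besides speeding up content generation, ListenHub supports mobile devices and is easy to use in a variety of settings. It is positioned as an efficient tool for getting information and serves the needs of a broad range of listeners.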

Tencent Hunyuan Image 2.0

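Tencent Hunyuan Image 2.0 is Tencent's latest AI image generation model, with substantially improved generation speed and image quality. It uses an encoder-decoder with an ultra-high compression ratio and a new diffusion architecture, bringing image generation time down to the millisecond range and avoiding the long waits of earlier generation. It also combines reinforcement learning algorithms with human aesthetic knowledge to improve realism and detail, which makes it suitable for professional users such as designers and creators.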

OpenMemory MCP

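OpenMemory is an open-source personal memory layer that provides private, portable memory management for large language models (LLMs). Users keep full control over their data and can stay secure while building AI applications. The project supports Docker, Python, and Node.js and suits developers building personalized AI experiences. It is also recommended for users who want to use AI without leaking personal information.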

Open source

FastVLM

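FastVLM is an efficient vision encoding model designed for vision-language models. It uses the FastViTHD hybrid vision encoder to reduce both the encoding time for high-resolution images and the number of output tokens, which improves model throughput and accuracy. FastVLM is positioned to give developers strong vision-language processing capabilities, and it performs especially well on mobile devices that need fast responses.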

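Pika is a video creation platform where users upload their creative ideas and AI automatically generates videos based on them. Its main features are video generation from a wide variety of ideas, professional video effects, and a simple, easy-to-use interface. It offers a free trial and targets creators and video enthusiasts.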

LiblibAI

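LiblibAI is a leading AI creation platform in China. It provides powerful AI creation capabilities to support creators' work. The platform hosts a very large number of free AI creation models, which users can search for and use to create images, text, audio, and more. It also supports users training their own AI models. As a platform for a broad base of creators, it aims to provide equal opportunities to create and to contribute to the creative industry so that everyone can enjoy creating.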

AIbase

Empowering the Future, Your AI Solution Knowledge Base

© 2025 AIbase